Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance on a variety of computer vision tasks, particularly visual classification problems, where new algorithms have been reported to achieve or even surpass human performance. In this paper, we examine whether CNNs are capable of learning the semantics of training data. To this end, we evaluate CNNs on negative images, since they share the same structure and semantics as regular images and humans can classify them correctly. Our experimental results indicate that when trained on regular images and tested on negative images, the model's accuracy is significantly lower than when it is tested on regular images. This leads us to conjecture that current training methods do not effectively train models to generalize the underlying concepts. We then introduce the notion of semantic adversarial examples - transformed inputs that semantically represent the same objects, but that the model does not classify correctly - and present negative images as one class of such inputs.
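The negative transform referred to above is simple pixel-intensity inversion. As a minimal sketch (assuming 8-bit integer or [0, 1]-normalized float inputs; the function name is illustrative, not from the paper):

```python
import numpy as np

def to_negative(image: np.ndarray) -> np.ndarray:
    """Invert pixel intensities: x -> 255 - x for uint8 images,
    x -> 1 - x for floats assumed normalized to [0, 1]."""
    if image.dtype == np.uint8:
        return 255 - image
    return 1.0 - image

# A tiny 2x2 grayscale example: black stays at one extreme, white at the other.
img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
neg = to_negative(img)
# neg == [[255, 191], [127, 0]]
```

Because inversion preserves edges and shapes while reversing intensities, such images keep the structure and semantics of the originals, which is what makes them a useful probe of whether a model has learned concepts rather than surface statistics.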